A system installation of Python and its associated packages is not always what you need. Scientific software usually develops faster than OS releases and is not aligned with them (true for both Mac and Linux). On Mac the situation is further complicated by the inclusion of both 32- and 64-bit Python libraries.
The solution is to create an isolated, (almost) self-contained environment that has all the packages you need for a particular project. This is called a virtualenv.
The program that tracks the list and the state of installed packages in the system is called a package manager.
apt, apt-get, aptitude - Debian-based Linuxes
brew (Homebrew) - for Mac
These tools install, upgrade and uninstall software system-wide, resolving dependencies on the fly.
However, Python programs and packages are also distributed via pip. pip is cross-platform, while the above tools are platform-specific. Mixing the two is not good: a system-wide pip install can overwrite or shadow files that the OS package manager tracks.
The current best practice is to install system-level stuff using your OS package manager and use pip only to install stuff inside virtualenvs.
venv (https://docs.python.org/3/library/venv.html) (formerly virtualenv)
This is the original, "pythonic" way to do this. venv has been part of the standard library since Python 3.3, and since Python 3.4 it installs pip into new environments by default.
As of Python 3.6, the pyvenv script is deprecated.
Use python3 -m venv /path/to/environment instead.
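The same kind of environment can also be created from Python itself, using the standard-library venv module that python3 -m venv runs. The following is a minimal sketch, not a recommended workflow: the helper make_env and the directory name demo-env are hypothetical, and with_pip=False is an assumption made only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def make_env(parent_dir, name="demo-env"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip, which keeps this sketch fast;
    # pass with_pip=True to get a usable pip inside the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = make_env(tmp)
        # Every venv has a pyvenv.cfg file at its root describing
        # the interpreter it was created from.
        print((env_dir / "pyvenv.cfg").read_text())
```

For everyday use the command-line form is simpler; the module form is mostly useful in setup scripts.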
$ python3 -m venv ~/.venv/biodata3
$ source ~/.venv/biodata3/bin/activate
(biodata3)$ deactivate
(biodata3)$ pip install -U pip
(biodata3)$ pip install jupyter
(biodata3)$ pip freeze > requirements.txt
(biodata3)$ less requirements.txt
(biodata3)$ ls -lah ~/.venv/biodata3
(biodata3)$ pip install -r requirements.txt
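To see roughly what pip freeze records, the installed packages can be listed from Python with importlib.metadata (Python 3.8+). This is a simplified sketch: the function name freeze is hypothetical, and unlike pip freeze it does not handle editable or VCS installs specially.

```python
from importlib.metadata import distributions


def freeze():
    """Return 'name==version' lines for packages visible to this interpreter."""
    lines = set()
    for dist in distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:
            # Skip distributions with missing or unreadable metadata.
            continue
        if name and version:
            lines.add(f"{name}=={version}")
    # Sort case-insensitively so the listing is stable between runs.
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze()))
```

Run inside an activated environment, this lists only what that environment's interpreter can see, which is why a requirements.txt generated there describes that environment alone.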
Sometimes the order of installation is important and pip can't figure it out on its own. In that case, install the requirements one line at a time, in file order:
(biodata3)$ xargs -L 1 pip install < requirements.txt
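pip can also install directly from a git repository (here, the rewrite branch):
(biodata3)$ pip install git+https://github.com/eco32i/ggplot.git@rewrite
Or clone the repo and install it locally in editable mode like so: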
(biodata3)$ git clone https://github.com/eco32i/ggplot.git
(biodata3)$ cd ggplot && git checkout rewrite
(biodata3) ggplot$ pip install -e .
conda
This tool is developed and maintained by Continuum Analytics (now Anaconda, Inc.), the company behind the Anaconda Python distribution. First install miniconda to keep things nice and lean: https://conda.io/miniconda.html
$ conda create --name biodata3 python=3 pandas
$ conda create --name biodata python=2 numpy matplotlib
$ conda info --envs
$ source activate biodata3
(biodata3)$ source deactivate
conda and pip
(biodata3)$ conda install numpy
(biodata3)$ pip install ipython
(biodata3)$ conda env export > environment.yml
(biodata3)$ less environment.yml
(biodata3)$ conda env create -f environment.yml
(biodata3)$ source deactivate
$
Whatever you install while in the environment will only be available inside that environment. Note how the output of the which python command changes when inside an environment.
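The same check can be made from inside Python. The sketch below (the helper in_virtualenv is a hypothetical name) compares sys.prefix with sys.base_prefix. This applies to environments created with venv; conda environments are not detected this way.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("in venv:    ", in_virtualenv())
```

Running it before and after activating an environment shows the interpreter path and prefix switch to the environment's directory.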